Point data manipulation with PDAL

http://pdal.io

PDAL (the Point Data Abstraction Library) is an open source library for handling massive point data sets. It's name is derived from GDAL - since it aims to sit in the same space for point data.

PDAL is actually a C library - if you're writing applications you can insert it into your code. It also has python bindings. Today we'll explore some of PDAL's capabilities using it's command line applications - which are mostly wrappers to PDAL's pipeline functions.

We'll also us a sneaky bit of LibLAS: http://www.liblas.org

...but you'll hopefully see why we'd prefer PDAL in the end.

Agenda for this session

A lightning speed overview of point data handling and manipulation:

  1. Getting information about a point dataset
  2. Collecting a subset from a LiDAR survey
  3. Requesting only a specific point class from a dataset
  4. Classifying ground (in case you don't like the vendor's version of 'ground')
  5. Requesting 'height above ground' instead of 'absolute height'
  6. Generating a bare earth DEM and a DSM

We will do all this on the command line, viewing results in CloudCompare or this notebook. These tasks are based on the PDAL workshop here: http://www.pdal.io/workshop/index.html, and are very much 'learn by doing'. PDAL is very well documented, please keep reading for more information.

...so feel free to zoom ahead, create and share!

Set up

module purge
module load PDAL cloudcompare

Locate data

We will use a LiDAR survey obtained over Merimbula in 2013. Here is it's catalogue entry:

THREDDS http://dapds00.nci.org.au/thredds/catalog/rr1/Elevation/Merimbula0313/catalog.html

The path to the data via the VDI is:

/g/data1/Elevation/Merimbula0313/Tiles_2k_2k

1. Basic information

Try:

pdal info /g/data1/Elevation/Merimbula0313/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las

...and compare with:

lasinfo /g/data1/Elevation/Merimbula0313/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las
Lasinfo gives more compact results - but can only read LAS. PDAL's info function can tell you about dimensions in any dataset it has a schema for reading: http://www.pdal.io/stages/readers.html, which hints also that PDAL can process point data in a diverse range of data formats.

2. Clipping point data with PDAL

Straight into the fire! We're going straight to PDAL's pipeline architecture, which gives it an enormous amount of felxibility and power. A pipeline is a set of operations chained together and defined in a JSON file. You'll see it in action here!

a. Why do we want to clip LAS data?

LAS tiles are pretty hard to handle - you get a lot of extra data that you may not want, and they are pretty much always boringly square. If we only need a certain region, we can get just those points using PDAL.

b. An example - selecting Merimbula town

The image here shows that Merimbula town covers several LIDAR tiles. This is an extra challenge - it means some tiles have a lot of data we don't want.

(QGIS screenshot pic)

You can see a polygon around a region of interest - it's saved as geoJSON and looks like this:

{
"type": "FeatureCollection",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::28355" } },
"features": [
{ "type": "Feature", "properties": { "id": 0 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ 759094.480855233967304, 5913008.271593709476292 ], [ 758464.699909413931891, 5912716.199270982295275 ], [ 757743.646362751838751, 5912898.744472593069077 ], [ 757716.26458250079304, 5913304.907546310685575 ], [ 757373.992329337401316, 5913418.998297326266766 ], [ 757018.029186049010605, 5913724.761510098353028 ], [ 757556.537531022448093, 5913784.088700683787465 ], [ 757828.153738587978296, 5913997.946465536952019 ], [ 757828.153738587396219, 5914326.52782854065299 ], [ 758357.534823469701223, 5914381.291389083489776 ], [ 758877.788648267393, 5914554.709330711513758 ], [ 758850.406868015765212, 5914810.272613044828176 ], [ 759042.079329782165587, 5914837.654393311589956 ], [ 759151.606450793216936, 5914673.363711818121374 ], [ 759370.660692813224159, 5914709.872752171941102 ], [ 759361.533432727912441, 5915102.34493575617671 ], [ 760593.713544093072414, 5915138.853976195678115 ], [ 761177.858189482591115, 5915047.581375411711633 ], [ 761123.094628979102708, 5914235.255227984860539 ], [ 761260.003530243760906, 5914007.07372591085732 ], [ 761570.33037310524378, 5913952.31016543880105 ], [ 761369.530651255394332, 5913559.837981833145022 ], [ 761141.349149147979915, 5913459.438120897859335 ], [ 760484.186423085397109, 5913377.292780089192092 ], [ 759817.896436938317493, 5913632.856062367558479 ], [ 759516.696854161447845, 5913550.710721591487527 ], [ 759416.29699323582463, 5913286.020179163664579 ], [ 759094.480855233967304, 5913008.271593709476292 ] ] ] } }
]
}

...but PDAL needs WKT - using this website: http://rodic.fr/blog/online-conversion-between-geometric-formats/, we can get a WKT polygon:

POLYGON((759094.480855234 5913008.2715937095,758464.6999094139 5912716.199270982,757743.6463627518 5912898.744472593,757716.2645825008 5913304.907546311,757373.9923293374 5913418.998297326,757018.029186049 5913724.761510098,757556.5375310224 5913784.088700684,757828.153738588 5913997.946465537,757828.1537385874 5914326.527828541,758357.5348234697 5914381.2913890835,758877.7886482674 5914554.7093307115,758850.4068680158 5914810.272613045,759042.0793297822 5914837.654393312,759151.6064507932 5914673.363711818,759370.6606928132 5914709.872752172,759361.5334327279 5915102.344935756,760593.7135440931 5915138.853976196,761177.8581894826 5915047.581375412,761123.0946289791 5914235.255227985,761260.0035302438 5914007.073725911,761570.3303731052 5913952.310165439,761369.5306512554 5913559.837981833,761141.349149148 5913459.438120898,760484.1864230854 5913377.292780089,759817.8964369383 5913632.856062368,759516.6968541614 5913550.7107215915,759416.2969932358 5913286.020179164,759094.480855234 5913008.2715937095))

c. Making a list of LIDAR tiles.

We need to know which tiles contain our data. The tile index shapefile ( ) will help us to figure out which tiles we need - here they are:

Merimbula2013-C3-AHD_7565916_55_0002_0002.las
Merimbula2013-C3-AHD_7565914_55_0002_0002.las
Merimbula2013-C3-AHD_7565912_55_0002_0002.las
Merimbula2013-C3-AHD_7585916_55_0002_0002.las
Merimbula2013-C3-AHD_7585914_55_0002_0002.las
Merimbula2013-C3-AHD_7585912_55_0002_0002.las
Merimbula2013-C3-AHD_7605916_55_0002_0002.las
Merimbula2013-C3-AHD_7605914_55_0002_0002.las
Merimbula2013-C3-AHD_7605912_55_0002_0002.las

d. Constructing a PDAL pipeline

We create a JSON file which tells PDAL what to do:

nano merimbula_pipeline.json

..and paste in the following:

{
    "pipeline": [      
        { "filename": "../merimbula2013/Tiles_2k_2k/Merimbula2013-C3-AHD_7585912_55_0002_0002.las",
          "tag": "A"
        },
        { "filename": "../merimbula2013/Tiles_2k_2k/Merimbula2013-C3-AHD_7585914_55_0002_0002.las",
          "tag": "B"
        },
    { "filename": "../merimbula2013/Tiles_2k_2k/Merimbula2013-C3-AHD_7605914_55_0002_0002.las",
          "tag": "C"
        },
        {
            "inputs": ["A", "B", "C"],
            "type": "filters.crop",
            "polygon": "POLYGON((759094.480855234 5913008.2715937095,758464.6999094139 5912716.199270982,757743.6463627518 5912898.744472593,757716.2645825008 5913304.907546311,757373.9923293374 5913418.998297326,757018.029186049 5913724.761510098,757556.5375310224 5913784.088700684,757828.153738588 5913997.946465537,757828.1537385874 5914326.527828541,758357.5348234697 5914381.2913890835,758877.7886482674 5914554.7093307115,758850.4068680158 5914810.272613045,759042.0793297822 5914837.654393312,759151.6064507932 5914673.363711818,759370.6606928132 5914709.872752172,759361.5334327279 5915102.344935756,760593.7135440931 5915138.853976196,761177.8581894826 5915047.581375412,761123.0946289791 5914235.255227985,761260.0035302438 5914007.073725911,761570.3303731052 5913952.310165439,761369.5306512554 5913559.837981833,761141.349149148 5913459.438120898,760484.1864230854 5913377.292780089,759817.8964369383 5913632.856062368,759516.6968541614 5913550.7107215915,759416.2969932358 5913286.020179164,759094.480855234 5913008.2715937095))",
            "outside": false
        },
        "./merimbulatown.las"
    ]
}

e. Apply our clipping operation

Then we execute the task using:

pdal pipeline merimbula_pipeline.json

This will result in a set of points inside your polygon being written into a .LAS file at the location specified in the pipeline file. Now you have a template for doing this job with pretty much any LAS tiles!

In a new terminal, type:

cloudcompare &

...and use it's file/open menu to navigate to your newly made LAS file. Take a look at it there (hint - use the projections menu to convert the Z dimension to a scalar field to colour your points by height).

Caution

If your polygon is quite large, or your points very dense, or both, you can still get a massive dataset! Use pdal info to get an estimate of how dense the data are, and figure out how much area you are clipping to estimate the final file size before going ahead.

In [ ]:


In [ ]: